The Hardness of Approximation of Euclidean k-Means

Authors

  • Pranjal Awasthi
  • Moses Charikar
  • Ravishankar Krishnaswamy
  • Ali Kemal Sinop
Abstract

The Euclidean k-means problem is a classical problem that has been extensively studied in the theoretical computer science, machine learning and computational geometry communities. In this problem, we are given a set of n points in Euclidean space R^d, and the goal is to choose k center points in R^d so that the sum of squared distances of each point to its nearest center is minimized. The best approximation algorithms for this problem include a polynomial time constant factor approximation for general k and a (1 + ε)-approximation which runs in time poly(n) exp(k/ε). At the other extreme, the only known computational complexity result for this problem is NP-hardness [1]. The main difficulty in obtaining hardness results stems from the Euclidean nature of the problem, and the fact that any point in R^d can be a potential center. This gap in understanding left open the intriguing possibility that the problem might admit a PTAS for all k, d. In this paper we provide the first hardness of approximation for the Euclidean k-means problem. Concretely, we show that there exists a constant ε > 0 such that it is NP-hard to approximate the k-means objective to within a factor of (1 + ε). We show this via an efficient reduction from the vertex cover problem on triangle-free graphs: given a triangle-free graph, the goal is to choose the fewest vertices such that every edge is incident on at least one chosen vertex. Additionally, we give a proof that the current best hardness results for vertex cover can be carried over to triangle-free graphs. To show this we transform G, a known hard vertex cover instance, by taking a graph product with a suitably chosen graph H, and showing that the size of the (normalized) maximum independent set is almost exactly preserved in the product graph using a spectral analysis, which might be of independent interest.

1998 ACM Subject Classification: F.2.2 Nonnumerical Algorithms and Problems
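
As a concrete illustration of the objective described in the abstract, the following minimal Python sketch evaluates the k-means cost of a fixed set of centers: the sum, over all input points, of the squared Euclidean distance to the nearest center. The function names (`squared_distance`, `kmeans_cost`) and the sample points are illustrative assumptions and do not come from the paper; the sketch only evaluates the cost of given centers and does not search for optimal ones.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Sum of squared distances from each point to its nearest center.

    Hypothetical helper for illustration: it scores a candidate solution,
    it does not optimize the choice of centers.
    """
    if not centers:
        raise ValueError("at least one center is required")
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Four points on a line, forming two well-separated pairs (made-up data).
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
centers = [(1.0, 0.0), (11.0, 0.0)]

print(kmeans_cost(points, centers))        # 4.0
print(kmeans_cost(points, [(6.0, 0.0)]))   # 104.0
```

Note that the centers here are not required to be input points; as the abstract points out, any point of the space may serve as a center, which is part of what makes hardness results difficult to obtain.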


Similar resources

Approximation Algorithms for Bregman Clustering, Co-clustering and Tensor Clustering

The Euclidean k-means problem is fundamental to clustering and over the years it has been intensely investigated. More recently, generalizations such as Bregman k-means [8], co-clustering [10], and tensor (multi-way) clustering [40] have also gained prominence. A well-known computational difficulty encountered by these clustering problems is the NP-hardness of the associated optimization task, ...


Approximation Algorithms for Bregman Co-clustering and Tensor Clustering (10 Feb 2009)

In the past few years powerful generalizations to the Euclidean k-means problem have been made, such as Bregman clustering [7], co-clustering (i.e., simultaneous clustering of rows and columns of an input matrix) [9, 17], and tensor clustering [8, 32]. Like k-means, these more general problems also suffer from the NP-hardness of the associated optimization. Researchers have developed approximat...


Approximation Algorithms for Bregman Co-clustering and Tensor Clustering (arXiv:0812.0389v3 [cs.DS], 15 May 2009)

In the past few years powerful generalizations to the Euclidean k-means problem have been made, such as Bregman clustering [7], co-clustering (i.e., simultaneous clustering of rows and columns of an input matrix) [9, 18], and tensor clustering [8, 34]. Like k-means, these more general problems also suffer from the NP-hardness of the associated optimization. Researchers have developed approximat...


Approximation Algorithms for Bregman Co-clustering and Tensor Clustering

In the past few years powerful generalizations to the Euclidean k-means problem have been made, such as Bregman clustering [7], co-clustering (i.e., simultaneous clustering of rows and columns of an input matrix) [9, 18], and tensor clustering [8, 34]. Like k-means, these more general problems also suffer from the NP-hardness of the associated optimization. Researchers have developed approximat...


Hardness and Non-Approximability of Bregman Clustering Problems

We prove the computational hardness of three k-clustering problems using an (almost) arbitrary Bregman divergence as dissimilarity measure: (a) The Bregman k-center problem, where the objective is to find a set of centers that minimizes the maximum dissimilarity of any input point towards its closest center, and (b) the Bregman k-diameter problem, where the objective is to minimize the maximum ...



Publication date: 2015